
[Tutorial] Building an Enterprise-Grade High-Availability PostgreSQL Cluster from Scratch


Author's Preface

This article is reproduced from my blog, original link: [Tutorial] Building an Enterprise-Grade High-Availability PostgreSQL Cluster from Scratch

Before we begin, please note the following:

  1. This article assumes you have completed [Tutorial] Building a Minimal High-Availability k3s Cluster from Scratch
  2. The development environment used in this article is ArchLinux + Fish Shell. Due to cross-system command syntax differences, please adjust command formats according to your actual situation.
  3. The object storage service used in this article is S3-compatible. Please adjust configuration methods according to your actual situation.
  4. The commands in this article depend on various CLI tools (marked in the text or in prerequisite tutorials; common Linux CLI tools are not marked here). Since the installation process for CLI tools is tedious and installation steps vary across platforms, we will not elaborate further. Please refer to the official documentation for installation tutorials.

Introduction

PostgreSQL is the most powerful open-source relational database (its specific advantages have been discussed extensively online and will not be repeated here), and it is the relational database I use most frequently. Therefore, PostgreSQL is the primary relational database used throughout my tutorials. You can treat this tutorial as a very basic prerequisite for the Kubernetes-related tutorials that follow.

Next, let's discuss CloudNativePG (CNPG). What is it? In short, you can think of it as a Kubernetes operator purpose-built for running PostgreSQL, which greatly simplifies deploying and managing PostgreSQL on Kubernetes. At the same time, to meet enterprise-grade requirements, it provides many advanced features, including high availability (HA), automated backup, disaster recovery drills, automated scaling, and more.

This tutorial is based on k3s + Longhorn and uses CloudNativePG together with generic S3-compatible object storage to build a high-availability PostgreSQL cluster from scratch. We will demonstrate basic backup and operations tasks, and finally inject failures manually to test the database's automated recovery capabilities.

Prerequisites

Before starting deployment, please ensure the following prerequisites are met:

Infrastructure

First, you need to prepare a minimal high-availability k3s cluster similar to the one in [Tutorial] Building a Minimal High-Availability k3s Cluster from Scratch

CLI Tools

Before starting the tutorial, you need to prepare the following CLI tools:

  1. kubectl, used for cluster management. In the previous tutorial, we have already installed kubectl and configured management permissions for remote clusters (kubeconfig)
    • Verification command:
      kubectl version
    • Ideal output: displays Client Version: <client_version>; when connected to a cluster, it additionally returns Server Version: <server_version>. (Recent kubectl releases have removed the --short flag; plain kubectl version now prints the short form by default.)
  2. helm, used for deploying the Operator. In the previous tutorial, we already installed helm and configured management permissions for the remote cluster (i.e., kubeconfig)
    • Verification command:
      helm version
    • Ideal output: the version.BuildInfo field contains information such as Version: <helm_version>, GitCommit: <commit_hash>, etc.
  3. kubeseal, used for encrypting Kubernetes Secrets. Please refer to the official tutorial for installation
    • Verification command:
      kubeseal --version
    • Ideal output: outputs kubeseal version: <kubeseal_version>
  4. openssl, used for generating random passwords. This is generally included with the system. If not, you can refer to the official tutorial for installation
    • Verification command:
      openssl version
    • Ideal output: outputs OpenSSL version information (e.g., OpenSSL <openssl_version> ...)

Third-Party Infrastructure

You need to prepare an S3-compatible object storage service account (such as AWS S3, Wasabi, Backblaze B2, or self-hosted MinIO) and ensure you can create Buckets and access keys.

Deployment Architecture Overview

In this tutorial, we will implement the following:

  1. Run CloudNativePG Operator in the cnpg-system namespace.
  2. Create a 3-replica PostgreSQL cluster (1 Primary + 2 Replicas) in the data-infra namespace.
  3. Enable barman continuous backup to object storage and expose a ReadWrite Service for applications.
  4. Use cloudflared sidecar containers to securely expose the database to the public internet for development and testing.
  5. Operations drills to test database rolling updates and plugin installation.
  6. Disaster drills to randomly crash nodes and observe automatic recovery capabilities. Test rebuilding the database from backup data in object storage under extreme conditions.
  7. Implement observability by monitoring database status through Prometheus and Grafana.


Step 1: Install CloudNativePG Operator

The official recommendation is to deploy the Operator using Helm. First, add the Helm repository and complete the installation:

helm repo add cloudnative-pg https://cloudnative-pg.github.io/charts
helm repo update

kubectl create namespace cnpg-system

helm upgrade --install cloudnative-pg cloudnative-pg/cloudnative-pg \
--namespace cnpg-system \
--set controllerManager.resources.requests.cpu=200m \
--set controllerManager.resources.requests.memory=256Mi \
--set controllerManager.resources.limits.cpu=500m \
--set controllerManager.resources.limits.memory=512Mi


Installation command parameter explanation:

| Parameter / Snippet | Function | Remarks |
| --- | --- | --- |
| helm upgrade --install | Upgrades the Release if it exists, otherwise installs it for the first time | Keeps the deployment command idempotent |
| cloudnative-pg | Name of the Helm Release | Used by Helm to track this deployment |
| cloudnative-pg/cloudnative-pg | Repository and name of the Chart | Requires a prior helm repo add cloudnative-pg … |
| --namespace cnpg-system | Kubernetes namespace to use | The Operator will run in this namespace |
| --set controllerManager.resources.requests.cpu=200m | Minimum CPU requested by the Controller Manager | Guarantees scheduling with at least 0.2 cores |
| --set controllerManager.resources.requests.memory=256Mi | Minimum memory requested by the Controller Manager | Guarantees scheduling with at least 256Mi |
| --set controllerManager.resources.limits.cpu=500m | CPU limit for the Controller Manager | Throttled if it exceeds 0.5 cores |
| --set controllerManager.resources.limits.memory=512Mi | Memory limit for the Controller Manager | May trigger OOM if it exceeds 512Mi |

If you want to pin the Chart or image version, you can explicitly add --version <chartVersion> and --set image.tag=<controllerTag>. When these parameters are omitted, Helm uses the latest stable Chart version in the repository and the default image specified by the Chart.
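For example, a pinned variant of the install command could look like the sketch below; the placeholders must be replaced with versions actually published in the chart repository:

helm upgrade --install cloudnative-pg cloudnative-pg/cloudnative-pg \
--namespace cnpg-system \
--version <chartVersion> \
--set image.tag=<controllerTag>
# ...plus the resource-related --set flags from the command above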

Scheduling mechanism supplement: the controllerManager Pod is placed by the Kubernetes scheduler, which first filters out nodes that cannot satisfy the resource requests (at least 200m CPU and 256Mi memory), then scores the remaining nodes based on current cluster load, and finally selects the most suitable one. Without additional nodeSelector, taint tolerations, or affinity settings, the Pod may land on any node that meets the requirements, but the placement is determined by the scheduler's scoring and node state rather than being random.

If your cluster enforces network policies (for example via kube-router or Cilium), ensure that the controller in the cnpg-system namespace can reach the Kubernetes API and the target namespaces.
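If you do run a default-deny egress policy in that namespace, the following is a minimal sketch of an allow-all-egress NetworkPolicy for the operator; the policy name is arbitrary, and you should tighten the rules to match your CNI and security requirements:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: cnpg-operator-allow-egress
  namespace: cnpg-system
spec:
  podSelector: {}      # applies to every Pod in cnpg-system
  policyTypes:
    - Egress
  egress:
    - {}               # allow all egress (Kubernetes API, webhooks, target namespaces)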

After installation is complete, verify that the Operator is ready:

kubectl get pods -n cnpg-system
kubectl get crds | grep postgresql.cnpg.io

You should see CRDs such as clusters.postgresql.cnpg.io registered and the Operator Pod in the Running state.


Step 2: Prepare Namespace and Access Credentials

Install Sealed Secrets

In this tutorial, we need to use Sealed Secrets to manage sensitive information. Before using Sealed Secrets, you need to deploy the controller in the cluster and install the kubeseal CLI locally:

FAQ

  • Q: Why introduce Sealed Secrets? A: Sealed Secrets encrypts Kubernetes Secrets into manifests that can be safely stored in Git: even if committed to a repository, no plaintext is leaked. The controller on the cluster side unseals them and generates the real Secrets according to its permissions, so sensitive configuration can flow through a standard GitOps process.

helm repo add sealed-secrets https://bitnami-labs.github.io/sealed-secrets
helm repo update

kubectl create namespace sealed-secrets

helm upgrade --install sealed-secrets sealed-secrets/sealed-secrets \
--namespace sealed-secrets \
--set fullnameOverride=sealed-secrets-controller

The above command will run the controller in the sealed-secrets namespace with a fixed name of sealed-secrets-controller, making it convenient for subsequent examples to reference. If the cluster already has an existing deployment, you can skip this step.

Next, install kubeseal locally and execute kubeseal --version to verify the version. Please refer to the official installation tutorial.


Note: kubeseal by default looks for a service named sealed-secrets-controller in the kube-system namespace. If deployed in the sealed-secrets namespace like the example without specifying parameters, you will see error: cannot get sealed secret service: services "sealed-secrets-controller" not found. Therefore, subsequent commands must explicitly pass --controller-namespace sealed-secrets --controller-name sealed-secrets-controller, or adjust according to the actual installation location.

FAQ

  • Q: Why does running kubeseal show error: cannot get sealed secret service: services "sealed-secrets-controller" not found? A: kubeseal looks for a service with the same name in the kube-system namespace by default. In the example, we deployed the controller in the sealed-secrets namespace, so we need to add --controller-namespace sealed-secrets --controller-name sealed-secrets-controller, otherwise it won't be found.
  • Q: Then why not directly deploy the service in the kube-system namespace where kubeseal can find it by default? A: Mainly for isolation and easier maintenance. kube-system is generally occupied by cluster underlying components, keeping this namespace clean helps with troubleshooting. At the same time, an independent sealed-secrets namespace can centrally manage related resources, and it's also convenient for permission control in multi-environment or multi-tenant scenarios. The tool defaults to pointing to kube-system for compatibility with common installation methods; we just need to pass parameters according to the actual deployment.

Generate Business Secrets

Prepare a separate namespace for the database layer and use Sealed Secrets to manage its sensitive information, so unrelated resources are not mixed into the same namespace:

kubectl create namespace data-infra

Generate encrypted Secrets for database application accounts:

Step 1: Generate the original Secret manifest (will not be written to the cluster)

kubectl create secret generic cnpg-app-user \
--namespace data-infra \
--from-literal=username=app_user \
--from-literal=password="$(openssl rand -base64 20)" \
--dry-run=client -o json > cnpg-app-user.secret.json

This command wraps the username and a random password into a Kubernetes Secret JSON manifest and writes it to cnpg-app-user.secret.json; thanks to --dry-run=client, nothing is created in the cluster. The generated file looks similar to:

{
  "kind": "Secret",
  "apiVersion": "v1",
  "metadata": {
    "name": "cnpg-app-user",
    "namespace": "data-infra"
  },
  "data": {
    "password": "<base64-encoded random password>",
    "username": "<base64-encoded username>"
  }
}

Step 2: Use kubeseal to encrypt into a SealedSecret

kubeseal \
--controller-namespace sealed-secrets \
--controller-name sealed-secrets-controller \
--format yaml < cnpg-app-user.secret.json > cnpg-app-user.sealed-secret.yaml

Here we read the JSON from the previous step and pipe it to kubeseal. The --controller-namespace and --controller-name parameters ensure the CLI talks to the Sealed Secrets controller we deployed, and --format yaml outputs a SealedSecret manifest that can be committed to Git. The generated file looks similar to:

---
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: cnpg-app-user
  namespace: data-infra
spec:
  encryptedData:
    password: <encrypted password blob>
    username: <encrypted username blob>
  template:
    metadata:
      name: cnpg-app-user
      namespace: data-infra

Step 3: Apply the SealedSecret to the cluster and delete the original Secret

kubectl apply -f cnpg-app-user.sealed-secret.yaml
rm cnpg-app-user.secret.json

After the controller receives it, it will automatically unseal and generate a regular Secret with the same name (cnpg-app-user) for subsequent database resources to reference. Keep cnpg-app-user.sealed-secret.yaml safe for use in subsequent CI/CD processes.

Prepare S3-Compatible Object Storage Credentials

  1. Log in to the object storage service console (such as AWS S3, Wasabi, Backblaze B2, or a self-hosted MinIO cluster), create a dedicated Bucket, for example my-cnpg-backups (the bucket name used in the destinationPath later in this tutorial), and select the Region closest to your Kubernetes cluster.
  2. In the Bucket's access policy, disable public read/write and only allow credential access. If the provider supports lifecycle or version management, you can configure them in advance according to compliance requirements.
  3. Go to the credentials or key management page, create a pair of Access Keys for this tutorial, granting at least Object Read and Object Write permissions for the target Bucket. If necessary, you can restrict to a specific prefix range.
  4. Record the Access Key ID, Secret Access Key, Region (if applicable) returned by the console, and the S3 Endpoint, for example https://s3.ap-northeast-1.amazonaws.com or a custom domain provided by the provider. Some providers will note whether Path Style access is required; please record this as well.
  5. Save this information to your password manager and prepare the plaintext values to be written to Kubernetes Secret: aws_access_key_id, aws_secret_access_key, region (if Region is not fixed, fill in the provider's recommended value), and confirm the destinationPath to be used in CNPG (such as s3://my-cnpg-backups/production).

After completing the console configuration, you can continue to use Sealed Secrets to manage this set of credentials.
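Before sealing the credentials in the next step, you can optionally sanity-check them with any S3 client. For example, with the AWS CLI (assuming the new keys are exported as AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY, and using the example bucket and endpoint from this tutorial):

aws s3 ls s3://my-cnpg-backups \
--endpoint-url https://s3.ap-northeast-1.amazonaws.com \
--region ap-northeast-1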

Configure Object Storage Credentials Using SealedSecret

When planning to enable object storage backup, similarly use Sealed Secrets to generate access credentials (using S3-compatible service as an example):

kubectl create secret generic cnpg-backup-s3 \
--namespace data-infra \
--from-literal=aws_access_key_id=AKIA... \
--from-literal=aws_secret_access_key=xxxxxxxxxxxx \
--from-literal=region=ap-northeast-1 \
--dry-run=client -o json \
| kubeseal \
--controller-namespace sealed-secrets \
--controller-name sealed-secrets-controller \
--format yaml > cnpg-backup-s3.sealed-secret.yaml

kubectl apply -f cnpg-backup-s3.sealed-secret.yaml
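After the controller unseals it, a regular Secret with the same name should appear in data-infra; you can confirm with:

kubectl get secret cnpg-backup-s3 -n data-infra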

Step 3: Declare PostgreSQL Cluster

CloudNativePG uses the Cluster custom resource to define the database stack. Below is a production-friendly example (storage class uses Longhorn deployed in the previous tutorial):

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cnpg-ha
  namespace: data-infra
spec:
  instances: 3
  imageName: ghcr.io/cloudnative-pg/postgresql:16.4
  primaryUpdateStrategy: unsupervised
  storage:
    size: 40Gi
    storageClass: longhorn
  postgresql:
    parameters:
      max_connections: "200"
      shared_buffers: "512MB"
      wal_level: logical
    pg_hba:
      - hostssl all all 0.0.0.0/0 md5
  # Superuser and replication credentials are generated and managed by CNPG itself;
  # the application account only needs the secret referenced in bootstrap.initdb below.
  bootstrap:
    initdb:
      database: appdb
      owner: app_user
      secret:
        name: cnpg-app-user
  backup:
    barmanObjectStore:
      destinationPath: s3://my-cnpg-backups/production
      endpointURL: https://s3.ap-northeast-1.amazonaws.com
      s3Credentials:
        accessKeyId:
          name: cnpg-backup-s3
          key: aws_access_key_id
        secretAccessKey:
          name: cnpg-backup-s3
          key: aws_secret_access_key
        region:
          name: cnpg-backup-s3
          key: region
    retentionPolicy: "14d"
  monitoring:
    enablePodMonitor: true

Save as cnpg-ha.yaml and apply:

kubectl apply -f cnpg-ha.yaml

CNPG will automatically create the instance Pods, PVCs, Services, and other resources (unlike many operators, it manages the Pods directly rather than through a StatefulSet). Wait for the Pods to start:

kubectl get pods -n data-infra -l cnpg.io/cluster=cnpg-ha -w

When the primary and both replicas are in the Running state, the database cluster deployment is complete.
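If you have installed the optional cnpg kubectl plugin (see the CloudNativePG documentation for installation), you can also get a consolidated view of instances, roles, and replication status:

kubectl cnpg status cnpg-ha -n data-infra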

Step 4: Verify High Availability and Service Exposure

Check Services and Connection Information

kubectl get svc -n data-infra -l cnpg.io/cluster=cnpg-ha

kubectl port-forward svc/cnpg-ha-rw 6432:5432 -n data-infra
# run psql in a second terminal while the port-forward stays active
psql "postgresql://app_user:$(kubectl get secret cnpg-app-user -n data-infra -o jsonpath='{.data.password}' | base64 -d)@127.0.0.1:6432/appdb"

CNPG creates -rw (read-write, always routed to the primary) and -ro (read-only, served by the replicas) Services by default, making it convenient for applications to connect to the primary and replica instances respectively.
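For reference, applications running inside the cluster would normally skip the port-forward and use the Service DNS names directly; assuming the default cluster.local domain, the connection strings look roughly like this:

# Read-write traffic (primary)
postgresql://app_user:<password>@cnpg-ha-rw.data-infra.svc.cluster.local:5432/appdb
# Read-only traffic (replicas)
postgresql://app_user:<password>@cnpg-ha-ro.data-infra.svc.cluster.local:5432/appdb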

Simulate Failover

Force delete the Primary Pod to verify automatic recovery capability:

kubectl delete pod -n data-infra \
$(kubectl get pods -n data-infra -l cnpg.io/cluster=cnpg-ha,role=primary -o jsonpath='{.items[0].metadata.name}')

Within seconds, CNPG will automatically promote a Replica to Primary. Confirm the role change with the following command:

kubectl get pods -n data-infra -L role
kubectl logs -n data-infra -l cnpg.io/cluster=cnpg-ha,role=primary --tail=20

Step 5: Backup and Disaster Recovery Strategy

  • Continuous Backup (PITR): The barmanObjectStore configuration continuously archives WAL to object storage. You can trigger full backups with the kubectl cnpg backup plugin command, or declaratively with Backup / ScheduledBackup resources (see the example after this list).
  • Regular Recovery Drills: Create Backup + Cluster CRs in a temporary namespace to verify recovery from object storage to a specific point in time.
  • Monitoring and Alerting: After enabling monitoring.enablePodMonitor, you can scrape key metrics such as cnpg_pg_replication_lag and cnpg_pg_is_in_recovery in Prometheus.
  • Resource Quotas: Configure LimitRange and ResourceQuota for the data-infra namespace to prevent database instances from monopolizing node resources.
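
As mentioned in the first bullet, backups can also be requested declaratively. Below is a minimal sketch of an on-demand Backup and a nightly ScheduledBackup for the cnpg-ha cluster defined above; the resource names and the 03:00 schedule are illustrative only:

apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: cnpg-ha-ondemand
  namespace: data-infra
spec:
  cluster:
    name: cnpg-ha
---
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: cnpg-ha-nightly
  namespace: data-infra
spec:
  # six-field cron expression (seconds first): every day at 03:00
  schedule: "0 0 3 * * *"
  cluster:
    name: cnpg-ha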

Common Operations

  • Online Scaling: Simply adjust the spec.instances count, and CNPG will automatically create new Replicas and synchronize data (see the patch example after this list).
  • Parameter Adjustment: Update spec.postgresql.parameters, and the Operator will trigger a rolling restart to apply the configuration.
  • Version Upgrade: Modify spec.imageName to specify a new version image. CNPG will first create a new Replica, perform a switchover, and then reclaim the old primary.
  • Logical Replication: Through pg_hba and wal_level configuration, you can synchronize specific databases/tables to external data warehouses.
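
As referenced in the scaling bullet, the change is purely declarative. A quick sketch of scaling from 3 to 5 instances with kubectl patch (editing cnpg-ha.yaml and re-applying it through your GitOps pipeline works just as well):

kubectl patch clusters.postgresql.cnpg.io cnpg-ha -n data-infra \
--type merge -p '{"spec":{"instances":5}}'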

Summary and Next Steps

By now, we have deployed a PostgreSQL cluster with automatic failover, continuous backup, and declarative management capabilities on a high-availability k3s cluster. The next steps you can consider:

  1. Introduce Grafana Loki or ELK to collect database logs and improve observability.
  2. Use the CloudNativePG Pooler resource (built on PgBouncer) to optimize connection pool management (a minimal sketch follows at the end of this article).
  3. Combine with GitOps tools like Argo CD/Fleet to incorporate database CRs into continuous delivery pipelines, enabling configuration change review and rollback.
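
For item 2, the following is a minimal sketch of a CloudNativePG Pooler placed in front of the read-write Service; the instance count and PgBouncer parameters are illustrative and should be tuned to your workload:

apiVersion: postgresql.cnpg.io/v1
kind: Pooler
metadata:
  name: cnpg-ha-pooler-rw
  namespace: data-infra
spec:
  cluster:
    name: cnpg-ha
  instances: 2
  type: rw
  pgbouncer:
    poolMode: transaction
    parameters:
      max_client_conn: "500"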